Abstract - Leaf image segmentation is an important task
in automated plant identification. It is the process of
extracting the leaf from its background, which is a
challenging task. In this paper, we propose an efficient
and effective new approach for leaf image segmentation
that separates the leaf both from the background and from
the shadow cast when the photo was taken. The proposed
approach computes local descriptors of the image, which
are then classified to separate the different regions of
the image. We use Pseudo Zernike Moments (PZM) as a local
descriptor, combined with the K-means algorithm for
clustering. The efficiency of PZM for feature extraction
leads to very good results in a very short time.
Validation tests on a variety of images show that the
proposed approach segments the image effectively. The
results demonstrate a real improvement over those of
recent existing segmentation methods.

Index Terms - Pseudo Zernike Moments, plant leaves,
image segmentation, K-means algorithm.


I. Introduction 

Plants are essential living beings on our planet; they
form our nearest environment, on which several aspects of
life depend, such as food, oxygen, water and medicine.
Nowadays plants are increasingly threatened, and their
loss has a devastating impact on human life. In order to
protect plants we need to know more about them and to
disseminate that knowledge, even to non-specialists; but
their large number and their diversity are a challenge
even for specialists, who can know or remember only a
limited number of them.

Plant identification methods are based on the use of
taxonomy. Taxonomy is used by the specialists who examine
the plants for identification. The identification methods
can be divided into two broad categories. The first
category, called modern methods, is complex and can be
handled only by specialists since it considers biological
characteristics. The second category, called traditional
methods, is based on the visual identification of the form
of an important organ of the plant, such as the leaf,
flower or fruit, and identifies the plant through this
feature.

The leaves are considered the fundamental parameter
for plant identification [1], since they are available all
year round in almost all seasons, and they do not require
three-dimensional acquisition since the form of a leaf can
be retained in a two-dimensional image [2]. This justifies
their wide use in automatic identification systems, which
handle only two-dimensional images.

The leaves possess several characteristics such as
shape, color, veins and texture [3] [4]. Form is the most
used feature for plant identification; it is a
characteristic that is often inherited and not influenced
by the environment [5]. Leaf shape allows a better
description of the leaves than other characteristics such
as color or even texture [1]. Therefore, for leaf
identification we need to extract the leaf from the
background. Extracting the leaf from the image and
recovering its form is a very significant step in the
identification process. Most of the leaf images used
generally have a uniform background; however, segmenting
the leaf from the background remains a challenge due to
the noise produced by brightness variation and the shadow
produced by the leaves themselves. Our goal is to propose
an efficient method for leaf segmentation, which extracts
the leaf without shadow or background.

In this paper, we propose the use of Pseudo Zernike
Moments (PZM) as a local descriptor of leaf form for
efficient feature extraction. Using a local descriptor
instead of a global one allows more efficient feature
extraction. The local features are extracted from a
partitioned image, one descriptor per partition. The image
is thus represented by the feature descriptors of all its
partitions. The image descriptors are then classified and
the image pixels are segmented into different regions
based on the classification results obtained by the
K-means algorithm [6].

The rest of this paper is organized as follows: in
Section II, we present the related work. Section III gives
a presentation of Pseudo Zernike Moments. The proposed
method is then described in Section IV. Section V presents
some results and discussions. Finally, the paper is
concluded in Section VI.


II. Related Work

The plant identification process has been a subject of
interest for many recent studies. Few of them consider the
problem of leaf extraction from the background. In [7],
the authors propose the Leafsnap system for automatic
plant species identification; they use the Expectation
Maximization (EM) algorithm to classify each pixel in the
image by estimating the foreground and background color
distributions. For scanned pictures, the Otsu segmentation
algorithm [8] is used in [9]; the segmented image contains
two classes of pixels, foreground and background. In [10],
for gray-level images, the maximally stable extremal
regions algorithm is used for the segmentation of a single
object over the background; the algorithm computes a scan
in depth, and then detects an object to be segmented when
a stable number of connected components is reached. Arora
et al. [11] propose to use preprocessing techniques for
shadow removal; they apply Otsu thresholding on the
saturation space to obtain the shadow-free image. Arai et
al. [12] propose another system to identify plants; they
combine shape descriptors from the dyadic wavelet
transformation and Zernike complex moments.

Many works on leaf identification have focused on
feature extraction and shape classification. For leaf
shape description two approaches can be used: the first is
based on contours and the second is based on regions [1].
The importance of leaf margins for plant identification
requires the use of effective methods for the detection of
the different types of borders [13]. It is clear that a
good description using contours requires a good extraction
of the outline of the object, which is in itself a major
segmentation problem. On the other hand, a contour-based
descriptor extracts features only from the boundary, and
therefore loses the important information carried by the
region inside [14].

For the region approach, the details inside the borders
are considered. The shape is then described by features
extracted from the whole image [14]. The methods most
commonly used as form descriptors are moment invariants
such as Hu moments [15], Zernike moments [16] and Pseudo
Zernike Moments [17].

Hu moments are seven derived moments that are easy to
compute, but they do not represent an image accurately
[14]. Pseudo Zernike Moments allow a better representation
of the features; they are more robust to noise than
Zernike moments [18] and more effective, since the
characteristics described by the lower orders of PZM are
better than those of other moments, such as Zernike
moments [19]. PZM are considered very effective image
descriptors, used for recognition as well as for the
reconstruction of images [18]. Pseudo Zernike Moments
(PZM) provide a unique description of an object regardless
of transformations such as rotation or translation [17].
PZM allow a multilevel representation of the image thanks
to their orthogonality, with little redundant information;
they are robust to noise and rotation invariant, since
only the magnitude is used [20].


III. PZM as a Form Descriptor

Pseudo Zernike Moments are widely used as an image
descriptor for object recognition. Originally proposed by
Teh and Chin [17], Pseudo Zernike Moments are orthogonal
moments whose kernel is the set of Pseudo Zernike
polynomials, defined within a unit circle in polar
coordinates. PZM are the projection of the image intensity
function onto the Pseudo Zernike polynomials.

The Pseudo Zernike Moment of order $p$ and repetition $q$,
calculated for a 2D image of size $N \times N$ having the
intensity function $f(r, \theta)$, is given by the
following equation:

$$ PZM_{pq} = \frac{p+1}{\pi} \int_{0}^{2\pi} \int_{0}^{1} V^{*}_{pq}(r, \theta)\, f(r, \theta)\, r\, dr\, d\theta \tag{1} $$

where $V^{*}_{pq}$ is the complex conjugate of the complex
Pseudo Zernike polynomial $V_{pq}$, which can be separated
into two functions:

$$ V_{pq}(r, \theta) = R_{pq}(r)\, e^{jq\theta} \tag{2} $$

Where:

• $R_{pq}(r)$: radial polynomial in polar coordinates
$(r, \theta)$.

• $e^{jq\theta}$: angular function,
$e^{jq\theta} = (\cos\theta + j \sin\theta)^{q}$.

• $p$: moment order, a non-negative integer.

• $q$: moment repetition, an integer with
$0 \le |q| \le p$. Only the positive values are used since
the moments for negative values can be calculated using
the complex conjugate: $PZM_{p,-q} = PZM^{*}_{pq}$.

• $j$: imaginary unit, $j = \sqrt{-1}$.

• $\theta$: angle between the vector $r$ and the $x$ axis,
$\theta = \tan^{-1}(y / x)$, with $\theta \in [0, 2\pi]$.

• $r$: length of the vector from the origin to the pixel
$(x, y)$, $r = \sqrt{x^{2} + y^{2}}$.

• $R_{pq}(r)$ is calculated by the equation:

$$ R_{pq}(r) = \sum_{s=0}^{p-|q|} (-1)^{s}\, \frac{(2p+1-s)!}{s!\,(p-|q|-s)!\,(p+|q|+1-s)!}\; r^{\,p-s} \tag{3} $$

The image is described by a vector comprising the PZM for
all orders and repetitions:

$$ V_I = \{ PZM_{pq} \}, \quad p = 0, \dots, p_{max}, \quad q = 0, \dots, p \tag{4} $$

Since the $PZM_{pq}$ are complex numbers and real numbers
are easier to manipulate, the $PZM_{pq}$ are usually
divided into two parts: a real part $PZM^{r}_{pq}$ and an
imaginary part $PZM^{i}_{pq}$ [21] [11].

$$ PZM^{r}_{pq} = \frac{p+1}{\pi} \int_{0}^{2\pi} \int_{0}^{1} R_{pq}(r) \cos(q\theta)\, f(r, \theta)\, r\, dr\, d\theta \tag{5} $$

$$ PZM^{i}_{pq} = -\frac{p+1}{\pi} \int_{0}^{2\pi} \int_{0}^{1} R_{pq}(r) \sin(q\theta)\, f(r, \theta)\, r\, dr\, d\theta \tag{6} $$
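As an illustration of the radial polynomial, the sum over factorial terms can be evaluated directly. The following is a minimal sketch in Python, assuming the standard Pseudo Zernike radial polynomial of Teh and Chin; the function name `radial_poly` is hypothetical and is not taken from the paper.

```python
from math import factorial


def radial_poly(p, q, r):
    """Pseudo Zernike radial polynomial R_pq(r), assuming 0 <= |q| <= p.

    Assumed form (standard definition):
    R_pq(r) = sum_{s=0}^{p-|q|} (-1)^s (2p+1-s)! /
              (s! (p-|q|-s)! (p+|q|+1-s)!) * r^(p-s)
    """
    q = abs(q)
    if q > p:
        raise ValueError("requires |q| <= p")
    total = 0.0
    for s in range(p - q + 1):
        num = (-1) ** s * factorial(2 * p + 1 - s)
        den = factorial(s) * factorial(p - q - s) * factorial(p + q + 1 - s)
        total += num / den * r ** (p - s)
    return total
```

For low orders this gives R_00(r) = 1, R_11(r) = r, R_10(r) = 3r - 2 and R_20(r) = 10r^2 - 12r + 3, each equal to 1 at r = 1.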

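The discrete form of Pseudo Zernike moments is given
by the following equation:

$$ PZM_{pq} = \frac{4(p+1)}{\pi D^{2}} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} V^{*}_{pq}(x_c, y_c)\, f(x, y), \qquad x_c^{2} + y_c^{2} \le 1 \tag{7} $$

Where:

• $x, y$: the pixel coordinates before normalization;
$(x_c, y_c)$ are the corresponding normalized coordinates.

• $D = N$: case of the unit circle within the image.

• $D = N\sqrt{2}$: case of the image normalized within the
unit circle.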

The set of Pseudo Zernike polynomials of order lower than
or equal to $p$ contains $(p+1)^{2}$ linearly independent
polynomials. Polynomials of different orders correspond to
different image characteristics; this advantage is due to
the orthogonality of the Pseudo Zernike polynomials. The
moments of different orders can be calculated
independently of each other, and each one carries
different information with almost no redundancy.
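As a quick check of this count, the following worked sum is a sketch that assumes the repetitions of order k range over -k <= q <= k, which gives 2k + 1 polynomials per order. The second line counts only the non-negative repetitions q = 0, ..., k; the value 4 for the maximum order is used purely as an example.

```latex
\sum_{k=0}^{p} (2k+1) = (p+1)^{2}
\qquad \text{e.g. } p = 4:\; 1 + 3 + 5 + 7 + 9 = 25

\sum_{k=0}^{p} (k+1) = \frac{(p+1)(p+2)}{2}
\qquad \text{e.g. } p = 4:\; 1 + 2 + 3 + 4 + 5 = 15
```

Under these assumptions, a descriptor restricted to non-negative repetitions up to order 4 would hold 15 moments per window.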

Pseudo Zernike moments are defined in polar coordinates
within a unit circle; the pixel coordinates of a square
image therefore have to be normalized so that the radius
lies in the interval [0, 1], i.e. $x^{2} + y^{2} \le 1$.

The normalization is done by a linear transformation of
the pixel coordinates to the polar system, where the
center of the image is taken as the origin of the circle.

There are two possibilities for the normalization of the 
image: 

• The circle within the image : the unit circle is 
mapped within the image. The pixels outside the 
circle are ignored and will not be taken into account 
when calculating the PZM. 

• Image within the circle: the entire image is included 
in the circle, and no information will be lost since 
all pixels are taken into account when calculating 
PZM [22]. 



Fig. 1. Image normalization methods: (a) circle within the image, (b)
image within the circle.

The normalized coordinates $(x_c, y_c)$ inside the unit
circle are given by:

$$ x_c = \frac{2x + 1 - N}{D}, \qquad y_c = \frac{2y + 1 - N}{D} \tag{8} $$
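The coordinate mapping can be sketched with NumPy as follows. This is an illustrative sketch only: it assumes x_c = (2x + 1 - N) / D and y_c = (2y + 1 - N) / D with D = N (circle within the image) or D = N * sqrt(2) (image within the circle), and the function name and the `image_within_circle` flag are hypothetical.

```python
import numpy as np


def normalized_coordinates(n, image_within_circle=True):
    """Map pixel indices 0..n-1 of an n x n block to unit-circle coordinates.

    Returns the normalized Cartesian grids (x_c, y_c) together with the
    polar coordinates r and theta, with theta wrapped to [0, 2*pi).
    """
    # D = n * sqrt(2) keeps every pixel inside the unit circle;
    # D = n maps the circle inside the block, so corner pixels fall outside.
    d = n * np.sqrt(2.0) if image_within_circle else float(n)
    c = (2 * np.arange(n) + 1 - n) / d
    x_c, y_c = np.meshgrid(c, c, indexing="ij")
    r = np.hypot(x_c, y_c)
    theta = np.mod(np.arctan2(y_c, x_c), 2 * np.pi)
    return x_c, y_c, r, theta
```

With D = N * sqrt(2) the largest radius stays below 1, so no pixel is discarded; with D = N the corner pixels have a radius above 1 and would be ignored.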


IV. PZM Based Segmentation Method 

In this section, we present in detail our proposed
segmentation approach for extracting the leaf without its
shadow.

Plant leaf image segmentation is a two-phase process: the
first phase is feature extraction, and the second consists
of classifying the pixels of the image based on the
results of the first phase. In our case we start with
image partitioning and normalization, and then we compute
the PZM descriptors.

A. Image Partition and Normalization 

The RGB image is first converted to a grayscale image.
After color space conversion, the image is partitioned
into windows, and the PZM are computed for each window.
Partitioning provides better local feature extraction.



Fig. 2. Image partition 


For an image $I$ of size $N \times M$, the windows are of
equal size $W \times W$ and do not overlap. The total
number of windows is obtained by:

$$ \mathit{NBwidth} = \frac{N}{W}, \qquad \mathit{NBlength} = \frac{M}{W} \tag{9} $$

$$ \mathit{NBblock} = \mathit{NBwidth} \times \mathit{NBlength} \tag{10} $$

The window size $W$ is estimated experimentally; the value
$W = 4$ gives the best compromise between execution time
and description quality.

A window in the partitioned image can be located by two
coordinates $(x, y)$, where $x \in [0, \mathit{NBlength} - 1]$
and $y \in [0, \mathit{NBwidth} - 1]$. The image intensity
function $f_{xy}$ of this window at the pixel $(x_i, y_j)$
is given by the following equation:

$$ f_{xy}(x_i, y_j) = f(xW + x_i,\; yW + y_j) \tag{11} $$
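The partitioning step can be sketched with a NumPy reshape. This is a minimal sketch, not the authors' implementation: the function name `partition` is hypothetical, the image is indexed as (row, column), and dropping trailing rows and columns that do not fill a whole window is an assumption, since the text does not say how incomplete windows are handled.

```python
import numpy as np


def partition(image, w=4):
    """Split a 2-D image into non-overlapping w x w windows.

    Returns an array of shape (rows // w, cols // w, w, w) such that
    blocks[a, b, i, j] == image[a * w + i, b * w + j].
    """
    rows, cols = image.shape
    nb_r, nb_c = rows // w, cols // w
    # Assumption: incomplete windows at the right/bottom edges are dropped.
    cropped = image[: nb_r * w, : nb_c * w]
    return cropped.reshape(nb_r, w, nb_c, w).swapaxes(1, 2)
```

For an 8 x 12 image and w = 4 this yields 2 x 3 = 6 windows, and each window pixel is read from the offset position in the original image, as in the intensity-function relation above.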

After partitioning the image, the coordinates of each
pixel are normalized to a polar coordinate space, where
each block of the image is mapped within a unit circle.
This normalization technique is chosen because it
preserves information: all pixels are taken into account
when calculating the moments.

B. Features Extraction 

The features extraction step is performed by 
calculating the PZM for each window of the partitioned 
images. 



Fig. 3. Global image descriptor calculation for a one-channel
partitioned image using the PZM of each window (x, y).


Since the magnitudes of the PZM are rotation invariant,
only the magnitude is considered as a feature.
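To make the feature extraction step concrete, the sketch below computes the PZM magnitudes of a single window. It is an illustration under stated assumptions, not the authors' code: it assumes the standard radial polynomial, the "image within the circle" normalization (D = N * sqrt(2)), a discrete prefactor of 4(p+1) / (pi * D^2), and repetitions q = 0, ..., p. The names `radial_poly` and `pzm_magnitudes` are hypothetical.

```python
from math import factorial

import numpy as np


def radial_poly(p, q, r):
    """Pseudo Zernike radial polynomial R_pq(r) (standard form, assumed)."""
    total = 0.0
    for s in range(p - q + 1):
        num = (-1) ** s * factorial(2 * p + 1 - s)
        den = factorial(s) * factorial(p - q - s) * factorial(p + q + 1 - s)
        total = total + num / den * r ** (p - s)
    return total


def pzm_magnitudes(window, p_max=4):
    """Return |PZM_pq| for p = 0..p_max and q = 0..p of a square window."""
    n = window.shape[0]
    d = n * np.sqrt(2.0)  # image within the unit circle (assumption)
    c = (2 * np.arange(n) + 1 - n) / d
    x_c, y_c = np.meshgrid(c, c, indexing="ij")
    r = np.hypot(x_c, y_c)
    theta = np.arctan2(y_c, x_c)
    feats = []
    for p in range(p_max + 1):
        for q in range(p + 1):
            # Complex conjugate of V_pq = R_pq(r) * exp(j q theta)
            kernel = radial_poly(p, q, r) * np.exp(-1j * q * theta)
            moment = 4 * (p + 1) / (np.pi * d ** 2) * np.sum(kernel * window)
            feats.append(abs(moment))
    return np.array(feats)
```

With p_max = 4 the descriptor has 15 entries per window. Rotating the window by 90 degrees maps the sampling grid onto itself and only changes the phase of each moment, so the magnitudes are unchanged.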

The RGB image is divided into three color channels R, G
and B, and each channel is treated independently. The
descriptors of all windows of each channel are calculated
by following the same steps described above for a
one-channel image. A global descriptor of a window at
position (x, y) is then constructed from the three
descriptors of the three channel windows lying at the
same position.
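The construction of the per-window RGB descriptor can be sketched as a simple concatenation. In this sketch the per-channel descriptor is passed in as a function; `mean_std` is only a stand-in used for demonstration and is not the PZM descriptor, and both function names are hypothetical.

```python
import numpy as np


def rgb_window_descriptor(window_rgb, channel_descriptor):
    """Concatenate the descriptors of the R, G and B planes of one window.

    window_rgb has shape (W, W, 3); channel_descriptor maps a (W, W)
    array to a 1-D feature vector.
    """
    parts = [channel_descriptor(window_rgb[:, :, c]) for c in range(3)]
    return np.concatenate(parts)


def mean_std(window):
    """Stand-in descriptor for demonstration only (NOT the PZM descriptor)."""
    return np.array([window.mean(), window.std()])
```

If each channel yields a vector of length L, the window descriptor has length 3L; the channel order R, G, B used here is an assumption.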




Fig. 4. Descriptor calculation for a partitioned RGB image using the PZM of each window (x, y).


C. Clustering 

The image descriptors are then classified with the
K-means algorithm [6]. K-means is one of the most popular
clustering algorithms; it is known for its simplicity,
efficiency and speed. It has been used in many
applications and can easily be applied to image
segmentation.

The goal of the algorithm is to gather descriptors into
clusters while maximizing the similarity between
descriptors in the same cluster. Let
$X = \{x_1, x_2, \dots, x_n\}$ be the set of $n$
descriptors, represented as data points of dimension $d$,
to be clustered into $K$ clusters $C_1, \dots, C_K$ with
means $\mu_1, \mu_2, \dots, \mu_K$. The K-means algorithm
produces a partition such that the squared error between
the mean of a cluster and all data in the cluster is
minimized; the goal is to minimize the sum of the squared
error (SSE) over all $K$ clusters:

$$ J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^{2} \tag{12} $$

Optimizing this objective is known to be an NP-hard
problem [23]. The main steps of the K-means algorithm are
as follows:

1. Select k data points as initial cluster centroids.

2. For each data point of the whole data set, compute 
the clustering criterion function with each centroid. 
Assign the data point to its closest cluster 
centroids. 

3. Recalculate k centroids based on the data points 
assigned to them. 

4. Repeat steps 2 and 3 until convergence. 
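The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard iteration, not the implementation used in the experiments: the random initialization, the Euclidean distance as clustering criterion, the iteration cap and the function name `kmeans` are all assumptions.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Basic K-means; returns (labels, centroids, sse)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Step 2: assign each point to its closest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each centroid from its assigned points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        # Step 4: stop when the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment and sum of squared errors for the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    sse = float(((X - centroids[labels]) ** 2).sum())
    return labels, centroids, sse
```

In the segmentation setting, each row of X would be the descriptor of one window, and the returned labels give the cluster of each window.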

It is clear from this description that the result is
influenced by the desired number of clusters k. In our
study, different values were used for k. For scanned
images the k values varied between 2 and 4. For scan-like
images higher values were used. The image is then
segmented according to the classification result.
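One simple way to turn the per-window cluster labels into a per-pixel segmentation is to repeat each label over its window. The sketch below assumes non-overlapping W x W windows laid out on a regular grid; the function name is hypothetical.

```python
import numpy as np


def labels_to_segmentation(window_labels, w=4):
    """Expand a 2-D grid of window labels to a per-pixel label map.

    Every pixel of a w x w window receives the cluster label of that window.
    """
    return np.repeat(np.repeat(window_labels, w, axis=0), w, axis=1)
```

For a grid of 2 x 2 window labels and w = 2, the result is a 4 x 4 label map in which each 2 x 2 block carries the label of its window.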


V. Results and Discussions 

To test the presented method we use the Pl@ntLeaves¹
database, containing 5436 images of more than 70 plants.
It is part of the ImageCLEF 2012 Plant Identification Task
project.

The images contained in the database are categorized 
into three types: Scanned Images, Scan-Like images 
(photographed with a uniform white background) and 
photographed images (in the tree with a natural 
background). 


1 http://imedia-ftp.inria.fr:8080/imageclef2012/ImageCLEF2012PlantIdentificationTaskFinalPackage.zip

Fig. 5. Different types of images in the Pl@ntLeaves database.


The Pl@ntLeaves database contains 3070 scanned 
images, 897 scan-like images and 1469 photographed 
images. 

For the experimental results, both scanned and scan-like
images were used. Fig. 6 shows the PZM-based segmentation
results for one-channel images.



Fig. 6. Segmentation results of grayscale images using Pseudo Zernike 
Moments. (a) results for scanned images, (b) results for scan-like images. 

The images are first converted to grayscale; several
moment orders were then tested, and the order
$p_{max} = 4$ was finally retained as a good compromise
between quality and execution time.



The segmentation results produced are generally good.
In Fig. 7, the color space gives the best results for the
segmentation of the scanned images. For scan-like images,
however, light variation affects the segmentation and
produces worse results.

Exploiting the information carried by the three channels
improves the results of image segmentation using Pseudo
Zernike moments. Fig. 7 shows some examples of
segmentation results.

The results are compared to those produced by other
methods based on different descriptors, namely
Neutrosophic sets [24], entropy and multi-level
thresholding, with the same classification algorithm,
K-means.

Neutrosophic-based segmentation is performed on RGB
images, where each channel is transformed to the
Neutrosophic domain. To eliminate the indeterminacy we use
two methods, α-mean and β-enhancement, proposed by Sengur
[25]. The true subsets of the three channels are then
classified using K-means.

Entropy-based segmentation is performed by first
eliminating the background using the Otsu algorithm [8],
which produces a black and white image used as a binary
mask to extract the leaf and its shadow from the
background. Each pixel not belonging to the background is
considered as the center of a window of size W x W for
which the entropy is calculated; the global descriptor is
then classified.
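The entropy descriptor of a window can be sketched as follows. The text does not specify which entropy is used, so this sketch assumes the Shannon entropy of the gray-level histogram with 256 integer levels; the function name is hypothetical.

```python
import numpy as np


def window_entropy(window, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram of a window.

    Assumes integer gray levels in the range 0..levels-1.
    """
    hist = np.bincount(window.ravel().astype(np.int64), minlength=levels)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())
```

A uniform window has entropy 0, while a window split evenly between two gray levels has entropy 1 bit, so the value grows with the local gray-level variability.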

For multilevel thresholding segmentation, a binary mask
is also used for extracting the leaf and its shadow; the
algorithm proposed by Arora [26] is then applied to the
masked image. Fig. 8 and Fig. 9 show the segmentation
results of both scanned and scan-like images for the
different methods.

The segmentation results of PZM on three-channel images
are the best; Neutrosophic sets produce very similar
results. The results of both segmentation methods based on
entropy and multi-level thresholding are very sensitive to
light variation.





Fig. 7. Segmentation results of RGB images using Pseudo Zernike
Moments. (a) results for scanned images, (b) results for scan-like
images.

Fig. 8. Segmentation results of scanned images by the different methods.
(a) the original images, (b) PZM 1 channel, (c) PZM 3 channels, (d)
Neutrosophic sets, (e) entropy, (f) multi-level thresholding.

Fig. 9. Segmentation results of scan-like images. (a) the original images,
(b) PZM 1 channel, (c) PZM 3 channels, (d) Neutrosophic sets, (e)
entropy, (f) multi-level thresholding.


We have also compared our method with the method
proposed in [7], which is an improvement of the EM
(Expectation Maximization) algorithm. The EM algorithm is
judged by several studies [7] [27] to be the most
effective segmentation algorithm for leaf images.

Fig. 10 shows a comparison of the results obtained by our
method and those of the method in [7].



Fig. 10. Comparison of segmentation results (a) The original image, (b) 
Segmentation results by the EM algorithm, (c) segmentation results of 
one channel images using Pseudo Zernike Moments, (d) Segmentation 
results of three channel images using Pseudo Zernike Moments. 



Fig. 11. Segmentation results (a) the original image, (b) the segmentation 
result of the image by the modified EM algorithm, (c) the result of 
segmentation using Pseudo Zernike Moments of three channel images. 


The last row shows that our method improves the results,
with less sensitivity to changes in luminance.

PZM-based segmentation of three-channel images shows
better results than those of the modified EM method [7].
Fig. 11 shows an example of this improvement.

On the other hand, the average segmentation time
(feature extraction and classification) of the different
methods on the test images is given in Fig. 12.


Fig. 12. Average time elapsed by the different methods (PZM 1 channel,
PZM 3 channels, Neutrosophic sets, Entropy, Multi-level thresholding).


In addition to its computational speed, PZM-based
segmentation generates small descriptors, which allows a
faster segmentation.

The segmentation time obtained with Pseudo Zernike
moments is the shortest for both scanned and scan-like
images. The Neutrosophic-sets-based segmentation approach
is the slowest of the compared methods.


VI. Conclusion 

In this paper, we addressed the problem of identifying
plants through the shape of their leaves. We aim to
extract the leaf from its background, which is a
challenging task due to the noise produced by luminance
variation or by the shadow of the leaf itself.

Our goal was to exploit the power of Pseudo Zernike
moments as shape descriptors for better feature extraction
from leaf images. We propose the use of PZM as a local
descriptor of leaf form for efficient feature extraction.
The image descriptors are then classified and the image
pixels are segmented into different regions based on the
classification results. For the classification we used
k-means and its variant bisecting k-means, for their
simplicity and the quality of the classes produced.

We evaluated the proposed approach on a variety of
images, and the obtained results are of good quality. The
segmentation results of the proposed approach are better
than those of the Neutrosophic, entropy and multilevel
thresholding methods.

As future work, we intend to extend our research and
improve our segmentation method for photographed images,
where acquisition conditions and backgrounds are more
complex.